Conversion from facial myoelectric signals to speech: a unit selection approach
Abstract
This paper reports on our recent research on surface electromyographic (EMG) speech synthesis: the direct conversion of EMG signals from articulatory muscle movements into an acoustic speech signal. We introduce a unit selection approach that compares segments of the input EMG signal against a database of simultaneously recorded EMG/audio unit pairs and selects the best-matching audio units based on target and concatenation costs; the selected units are then concatenated to synthesize the acoustic speech output. We show that generating a proper speech output from the input EMG signal is feasible with this approach. We evaluate different properties of the units and investigate how much data is necessary for an initial transformation. Prior work on EMG-to-speech conversion used a frame-based approach from the voice conversion domain, which struggles to generate a natural F0 contour. Our unit selection approach may also address this problem.
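The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature layout, the Euclidean target cost, the boundary-sample concatenation cost, and the names `select_units`, `synthesize`, and `concat_weight` are all assumptions made for illustration. It uses dynamic programming to find the unit sequence with the lowest summed target and concatenation cost.

```python
import numpy as np


def select_units(emg_segments, db_emg, db_audio, concat_weight=1.0):
    """Choose one database unit per input EMG segment (hypothetical sketch).

    emg_segments: (T, D) features of the input EMG segments
    db_emg:       (N, D) EMG features of the database units
    db_audio:     (N, L) audio samples paired with each database unit
    Returns a list of T unit indices minimizing target + concatenation cost.
    """
    emg_segments = np.asarray(emg_segments, dtype=float)
    db_emg = np.asarray(db_emg, dtype=float)
    db_audio = np.asarray(db_audio, dtype=float)
    n_steps, n_units = len(emg_segments), len(db_emg)

    # Target cost (assumed): Euclidean distance between the input segment
    # and the EMG half of each database unit. Shape (T, N).
    target = np.linalg.norm(
        emg_segments[:, None, :] - db_emg[None, :, :], axis=2
    )

    # Concatenation cost (assumed): mismatch between the last audio sample
    # of unit i and the first audio sample of unit j. Shape (N, N).
    concat = concat_weight * np.abs(
        db_audio[:, -1][:, None] - db_audio[:, 0][None, :]
    )

    # Dynamic programming over all unit sequences.
    cost = target[0].copy()
    back = np.zeros((n_steps, n_units), dtype=int)
    for t in range(1, n_steps):
        total = cost[:, None] + concat  # total[i, j]: best path to i, then join j
        back[t] = np.argmin(total, axis=0)
        cost = total[back[t], np.arange(n_units)] + target[t]

    # Trace the cheapest path back from the final step.
    path = [int(np.argmin(cost))]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def synthesize(path, db_audio):
    """Concatenate the audio halves of the selected units."""
    db_audio = np.asarray(db_audio, dtype=float)
    return np.concatenate([db_audio[i] for i in path])


# Toy database: three units with 1-D EMG features and 2-sample audio.
db_emg = [[0.0], [1.0], [2.0]]
db_audio = [[0.0, 0.0], [0.0, 5.0], [5.0, 5.0]]
query = [[0.0], [1.9]]

path_no_join = select_units(query, db_emg, db_audio, concat_weight=0.0)
path_smooth = select_units(query, db_emg, db_audio, concat_weight=10.0)
```

With `concat_weight=0.0` the search reduces to nearest-neighbour matching on the EMG features; raising the weight trades target accuracy for smoother joins between consecutive audio units.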
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability of speech production. The data-based approach and the process-based approach are the two main approaches to speech synthesis, and each has its own challenges. Unit-selection speech synthesis and statistical parametr...
Codebook clustering for unit selection based EMG-to-speech conversion
This paper reports on our recent advances in using Unit Selection to directly synthesize speech from facial surface electromyographic (EMG) signals generated by movement of the articulatory muscles during speech production. We achieve a robust Unit Selection mapping by using a more sophisticated unit codebook. This codebook is generated from a set of base units using a two stage unit clustering...
Exemplar-based unit selection for voice conversion utilizing temporal information
Although temporal information of speech has been shown to play an important role in perception, most voice conversion approaches assume that speech frames are independent of each other, thereby ignoring the temporal information. In this study, we improve the conventional unit selection approach by using exemplars which span multiple frames as base units, and also take temporal information con...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Quantitative Evaluation of the Efficiency of Facial Bio-potential Signals Based on Forehead Three-Channel Electrode Placement For Facial Gesture Recognition Applicable in a Human-Machine Interface
Introduction: Today, facial bio-potential signals are employed in many human-machine interface applications for enhancing and empowering the rehabilitation process. The main point to achieve that goal is to record appropriate bioelectric signals from the human face by placing and configuring electrodes over it in the right way. In this paper, heuristic geometrical position and configuration of ...